Genomics and Machine Learning for Taxonomy Consensus: The Mycobacterium tuberculosis Complex Paradigm
Abstract
Infra-species taxonomy is a prerequisite for comparing features such as virulence across pathogen lineages. Mycobacterium tuberculosis complex taxonomy has evolved rapidly over the last 20 years through intensive clinical isolation, advances in sequencing, and the description of fast-evolving loci (CRISPR and MIRU-VNTR). Online tools for describing new isolates have been set up based on known diversity in either CRISPRs (also known as spoligotypes) or MIRU-VNTR profiles. The underlying taxonomies are largely concordant but use different names and offer different depths. The objectives of this study were 1) to make explicit the consensus that exists between the alternative taxonomies, and 2) to provide an online tool to ease classification of new isolates. Genotyping (24-VNTR, 43-spacer spoligotypes, IS6110-RFLP) was undertaken for 3,454 clinical isolates from the Netherlands (2004-2008). The resulting database was enlarged with African isolates to include most human tuberculosis diversity. Assignments were obtained using TB-Lineage, MIRU-VNTRPlus, SITVITWEB and an algorithm from Borile et al. By identifying the recurrent concordances between the alternative taxonomies, we proposed a consensus including 22 sublineages. Original and consensus assignments of all isolates in the database were subsequently fed into an ensemble learning approach based on the machine learning tool Weka to derive a classification scheme. All assignments were reproduced with very good sensitivities and specificities. When applied to independent datasets, the scheme was able to suggest new sublineages such as pseudo-Beijing. This Lineage Prediction tool, which is effective on 15-MIRU, 24-VNTR and spoligotype data, is available on the web interface "TBminer." Another section of this website helps summarize key molecular epidemiological data, easing tuberculosis surveillance.
Altogether, we successfully used machine learning on a large dataset to set up and make available the first consensus taxonomy for the human Mycobacterium tuberculosis complex. Additional developments using SNPs will help stabilize it.
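The idea of reconciling several taxonomies that use different names can be illustrated with a minimal sketch. It is not the authors' Weka-based ensemble: it swaps in a plain majority vote, and the tool identifiers (`tool_a`, `tool_b`, `tool_c`), the label mapping, and the `min_votes` threshold are all hypothetical placeholders chosen for illustration, not the published consensus.

```python
from collections import Counter

# Hypothetical mapping from (tool, tool-specific label) to a shared label.
# These entries are illustrative placeholders, not the published consensus.
TO_CONSENSUS = {
    ("tool_a", "East Asian"): "Beijing",
    ("tool_b", "Beijing"): "Beijing",
    ("tool_c", "Beijing"): "Beijing",
    ("tool_a", "Euro-American"): "Euro-American",
    ("tool_b", "Haarlem"): "Euro-American",
}


def consensus_label(assignments, min_votes=2):
    """Majority vote over mapped labels.

    `assignments` maps a tool name to the label that tool returned.
    Returns None when no label reaches `min_votes` or when the top
    two labels are tied.
    """
    votes = Counter()
    for tool, label in assignments.items():
        mapped = TO_CONSENSUS.get((tool, label))
        if mapped is not None:  # labels without a mapping cast no vote
            votes[mapped] += 1
    if not votes:
        return None
    (top, count), *rest = votes.most_common()
    if count < min_votes:
        return None
    if rest and rest[0][1] == count:  # tie between the two leading labels
        return None
    return top


if __name__ == "__main__":
    isolate = {"tool_a": "East Asian", "tool_b": "Beijing", "tool_c": "Beijing"}
    print(consensus_label(isolate))
```

Returning `None` on ties or sparse votes keeps discordant isolates visible for manual review instead of forcing them into a sublineage.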
Similar resources
Molecular Identification of Mycobacterium Tuberculosis Complex in Formalin-Fixed, Paraffin-Embedded Tissue Blocks of Extra Pulmonary Specimens using Genomics Extraction
Background: Tuberculosis has been detected in some extra pulmonary ecological niches. Although extra pulmonary tuberculosis (EPTB) is less frequent than Pulmonary Tuberculosis (PTB), its incidence has increased worldwide. The aim of this study was to investigate the presence of EPTB and MDR-EXPT in Formalin-fixed, paraffin-embedded tissue blocks among different samples in Kerma...
Biochemical characterization of PE_PGRS61 family protein of Mycobacterium tuberculosis H37Rv reveals the binding ability to fibronectin
Objective(s): The periodic binding of protein expressed by Mycobacterium tuberculosis H37Rv with the host cell receptor molecules i.e. fibronectin (Fn) is gaining significance because of its adhesive properties. The genome sequencing of M. tuberculosis H37Rv revealed that the proline-glutamic (PE) proteins contain polymorphic GC-rich repetitive sequences (PGRS) which have clinical importance i...
Rapid detection of atypical mycobacteria in patients with symptoms of pulmonary tuberculosis: evaluation of the QUB 3232 locus (590 bp) by the VNTR method
Background and Objective: Identification of atypical mycobacteria (non-tuberculous mycobacteria, NTM) is important because of the worldwide spread of these organisms. Recently, molecular studies have identified specific loci for mycobacterial species by DNA fingerprinting methods, but these methods are time-consuming and expensive. In this study, in addition to hsp65 PCR-RFLP meth...
Identification of Mycobacterium Tuberculosis Complex, Using Molecular Methods
Background and Objective: The high level of homogeneity observed among all bacteria in the Mycobacterium tuberculosis complex seriously challenges traditional biochemical identification of these pathogens in the laboratory. The work presented here was conducted to characterize Mycobacterium tuberculosis complex isolates in Golestan, Northern Iran. ...
Diagnostic value of the gyrB-RFLP PCR test for species identification of pathogenic mycobacteria in tuberculosis patients in Mazandaran Province
Background and purpose: Mycobacterium tuberculosis complex (MTBC) members are the causative agents of human and animal tuberculosis. Differentiation of MTBC members is essential for appropriate treatment of individual patients and for reducing drug resistance. Materials and methods: A total of 1345 samples were collected from patients clinically suspected of contracting tuberculosis that referred to hea...